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ABSTRACT 

This paper describes a Bayesian statistical method for determining the ge- 
netic basis of a complex genetic trait. The method uses a sample of unrelated 
individuals classified into two groups, for example cases and controls. Each group 
is assumed to have been genotyped at a battery of marker loci using a labora- 
tory effort efficient technique called DNA pooling. The aim is to detect and 
map a quantitative trait locus (QTL) that is not one of the typed markers. The 
method works by conducting an exact Bayesian analysis under a number of sim- 
plifying population genetic assumptions that are somewhat unrealistic. Despite 
this, the method is shown to perform acceptably on datasets simulated under 
a more realistic model, and furthermore is shown to outperform classical single 
point methods. 

1. Introduction 

For many traits of interest, including susceptibility to many genetic diseases, the genetic 
basis is complex, meaning that many genes of individually small effect (quantitative trait loci; 
QTLs) contribute. For detecting and mapping such QTLs, association mapping studies that 
use large samples of essentially unrelated individuals may have two advantages over linkage 
studies that use pedigrees or families: For a given sample size, association studies may have 
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more power to detect a QTL (e.g. Risch and Merikangas 1996, Risch 2000), and they may 
allow the QTL to be fine mapped with greater resolution or precision (e.g. Terwilliger 1995, 
McPeek and Strahs 1999). One important experimental design is a genome wide scan, in 
which many thousands of markers covering the whole genome are typed (see e.g. Lander 
1996, Risch and Merikangas 1996, Kruglyak 1999, Carlson et al. 2003). The aim may be to 
detect QTL that are not candidate genes and that have escaped detection in linkage studies. 
After preliminary analysis, attention may focus on relatively small chromosomal regions, 
each containing perhaps only one QTL. Because of nongenetic factors and the effects of 
other genetically distant QTLs, each focal QTL will explain only a small fraction of the 
variance in phenotype. If the trait is binary (such as presence or absence of a disease), the 
difference in QTL allele frequencies between the two trait groups will be small. The large 
numbers of markers required for a genome wide scan will need to have been typed in large 
numbers of individuals to detect such a QTL, and to infer its position, frequency, and effect 
on the trait. 

In order to reduce the cost of such a study, an experimental strategy called DNA pooling 
has been proposed (e.g Arnheim et al. 1985, Barcellos et al. 1997, Sham et al. 2002, Norton 
et al. 2004). Here DNA from individuals with similar phenotypes is physically mixed together 
into a pool before genotyping. After overheads to do with construction of pools and assay 
development, the cost of genotyping an entire pool at a given marker is reduced to the cost of 
genotyping a single individual. Thus costs may be reduced by a factor close to the number of 
individuals in a pool (divided by the number of experimental replicates used for each pool) , 
when the overheads can be spread across many markers and across many disease studies. 

In this paper I consider the simplest experimental design where there are two trait 
groups. These could have been classified by the presence or absence of binary trait such 
as a disease. Alternatively, if the trait is quantitative, individuals from each tail of the 
trait distribution could make up the two groups, for example the upper and lower 10% tails 
(e.g. Darvasi and Soller 1994, Bader et al. 2001). For simplicity I will refer throughout the 
paper to the two trait groups as cases and controls, and to one of the alleles at the QTL 
as the disease allele. I note that data collected from more than two groups can be analysed 
using the method developed here, by discarding or combining data from some of the groups. 
(An extreme example is where each pool contains two chromosomes, i.e. genotype data are 
available.) Such an approach is valid and may be useful, but will almost certainly not make 
most efficient use of the available data, since it will be based on only a marginal observation 
that is almost certainly not sufficient. 

Compared with individual genotyping, a DNA pooling strategy incurs three types of 
information loss and error. At each marker only the marginal counts of the alleles present 
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within each pool are are available, and so there is (i) no information about deviations from 
Hardy- Weinberg proportions within each marker within each pool (Risch and Teng 1998) 
and (ii) no information about phase or linkage information across markers within each pool 
(Johnson 2005). Further (iii), the marginal counts are only estimated from some kind of 
quantitative genotyping experiment (e.g. Germer et al. 2000, Le Hellard et al. 2002), rather 
than by counting individual genotyping experiments. 

The present approach deals with (iii) by allowing a quite general class of models for 
errors in allele frequency estimation. It attempts to deal with (i) and (ii) by using the full 
likelihood given the available data. This likelihood does however assume a model that makes 
simplifying assumptions that are not as realistic as one would like. This may be an acceptable 
price to pay, because it allows all the necessary computations to be done analytically or using 
simple numerical algorithms. The aim is to develop a method that can be used on very large 
data sets, with hundreds of cases and hundreds of controls typed at hundreds of markers. 

The simple model used here is a special case of the model introduced by McPeek and 
Strahs (1999) and studied by Morris et al. (2000), Zhang and Zhao (2000, 2002) and Liu et al. 
(2001) for haplotype and genotype data. In brief summary, I assume a unique mutation event 
(perhaps a single nucleotide polymorphism, or a deletion) generated the disease allele at the 
disease QTL, that there is a star shaped genealogy since that mutation, that the disease 
allele is absent from the control group, and that Hardy-Weinberg and linkage equilibrium 
apply in the control group. 

Avoiding the use of more comphcated algorithms such as Markov chain Monte Carlo 
(MCMC) means that there are no concerns about mixing and convergence, and no difficul- 
ties with computing normalising constants and Bayes factors for model testing and model 
comparison. When the model is correct, the Bayes factor is (in various classical senses, 
see O'Hagan and Forster 2004 ch.5) an optimal test statistic for detecting a QTL (see also 
Patterson et al. 2004). Therefore the present approach can be viewed as choosing an approx- 
imate model that is simple enough to allow an exact and optimal analysis. It will therefore 
complement alternative approaches that perform approximate or sub-optimal analyses for 
more realistic models. 

One example of such an alternative approach is to perform a classical single point 
analysis, which applies a separate test to the data for each marker. Such an approach can be 
made essentially free of any population genetic model, meaning that no worring assumptions 
are made but also that there is no efficient framework for combining analyses across many 
markers (e.g. to correct for multiple testing). When genotype data are available and when 
the QTL is not one of the typed markers, such single point methods are known to lack power 
(e.g. Risch 2000, Mott et al. 2000, ZoUner and Pritchard 2005), and may produce inefficient 
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point estimates and inefficiently large region estimates for the position of the QTL (e.g. 
Morris et al. 2002, 2003). 

One justification for the method developed here is that although the model is simple, 
it is to my knowledge the most complex model for which a Bayesian analysis has been 
implemented using data from DNA pools. To provide further justification and reassurance 
about the simplifying assumptions made, I have tested the method on data simulated under 
a more realistic full coalescent model. The (necessarily classical) measures of performance 
are encouraging in three respects. Firstly, the power to detect a QTL is substantially higher 
than for classical single point analysis. Secondly, point estimates for the position of the 
QTL derived from the present method outperform the simple procedure of choosing the 
map position of the marker with the most significant single point test result (the "minimum 
p- value method", e.g. Kaplan and Morris 2001a,b). Thirdly, after "flattening" the posterior 
using a factor (derived by McPeek and Strahs 1999) that corrects for the fact that the true 
genealogy is not in fact star shaped, credibility intervals for the position of the QTL cover its 
true position with frequency equal to their nominal size. That is, they are well calibrated. 

The structure of this paper is as follows. In section 2 I review the model introduced 
by McPeek and Strahs (1999), for which an exact Bayesian analysis can be performed using 
multilocus estimated allele frequency data, collected using DNA pools. In section 3 I describe 
in detail how the computations for such an analysis are performed. In section 4 I review a 
more realistic coalescent model that I have used to generate simulated data sets on which to 
test the current approach. In section 5 I describe the performance of the method developed 
in sections 2-3 on those simulated data sets. In section 6 I summarise the results and discuss 
how the method could be improved. 

2. The Simplified Model and Prior 

The main notations and abbreviations used are listed in table 1, and the conditional 
independencies of the variables are represented in figure 1. 

I assume the data (collectively denoted y) consist of allele frequency estimates at L single 
nucleotide polymorphisms (SNPs), obtained from a single pool of case chromosomes and 
a single pool of ric control chromosomes. Let nii be the known map position of the i-th SNP. 
Let the alleles at each SNP be arbitrarily labelled and 1, with ^d,j(l) = "^d — yd,j(0) the 
estimated count of the 1 allele at the i-th SNP in the cases, and ^c,i(l) = — VcA^) the 
estimated count of the 1 allele at the i-th SNP in the controls. When there is no ambiguity, 
y will mean the estimated count of the 1 allele at some SNP in some pool that is to be 
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inferred from the context. (This can easily be generahsed to let yd,i be arbitrary vector 
valued information about the counts of the two alleles at the i-th SNP in the cases, etc.) 
The method may be generalised to allow more than two alleles at each marker. 

Let yd,i(l) and yc,j(l) be the true counts of the 1 allele at the i-th SNP in the cases and in 
the controls respectively. I assume that an error model Pr {yd,i, yc,i\yd,i, yc,i) has been chosen 
and parameterised, for example from a calibration data set consisting of pairs {y, y) that were 
obtained from a given set of individuals by genotyping them as a pool and by genotyping 
them individually. Note that the error model cannot be factorised unless we assume that 
the errors in the case and control pools arc unconditionally independent. However, wc may 
believe that the data were obtained using assays that vary in precision across SNPs. Letting 
Cj indicate the unknown precision of the assay used for the i-th SNP, we could model these 
beliefs as 

Pr {yd,i: yc,i\yd,i: yd) = XI (yd,i|yd,i, e^) Pr {yc,i\yc,i: Pr (a) , (1) 

that is, the error distribution is modelled as a mixture distribution where each component of 
the mixture factorises. In the following, reasonable flexibility in the nature of this error model 
is allowed: the errors must be independent across SNPs but they need not be independent 
across pools and they do not need to be identically distributed across SNPs. A example 
error model is given in section 4. 

Let the unknown map position of the disease QTL relative to the leftmost marker SNP 
be n; often this will be the variable of main interest. I allow any prior Pr (/x); obvious choices 
would be a uniform density on physical distance, or one based on known gene density (see 
e.g. Rannala and Reeve 2001). I assume a unique mutation event generated the disease allele 
at the disease QTL, so some number of haplotypes {x^) in the case pool carry the disease 
allele and are identical by descent (i.b.d.) at this position. I assume ~ Bin(nd,p) for a 
rate p, and since p itself is uncertain I assume a beta prior with parameter R = (i?i, Rq). so 
the prior mean is Ri/{Rq + i?i). Specifying (i?i,i?o) = (1) 1) specifics a fiat prior on [0, 1] 
for p. 1 assume that the disease allele is absent from the control pool. The adequacy of 
this approximation, and ways in which it could be relaxed, are taken up in the discussion in 
section 6. 

Each disease allele is embedded in a block of i.b.d. haplotype, the "ancestral haplotype", 
with breakpoints at distances dj^ and c/r to the left and right of the position of the QTL. 
I assume a star shaped genealogy for the disease allele, so that c^l and c?r for all blocks of 
ancestral haplotype are conditionally independent and identically distributed (i.i.d.) from an 
exponential distribution with mean l/r Morgans, where r is the age of the disease allele in 
generations and distances are measured in Morgans (McPeek and Strahs 1999, see also Morris 
et al. 2000, Zhang and Zhao 2000, 2002, Liu et al. 2001). (The left and right breakpoints 
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are the positions of the nearest crossovers in r meioses where crossovers occur as a Poisson 
process with rate 1.) I allow any prior Pr (r); lognormal and gamma are reasonable choices. 
Specifying an exponential prior with sufficiently small parameter T (so the prior mean 1/T is 
sufficiently large) specifies a prior that is effectively flat over the region where the likelihood 
is non-negligible. This has the effect of making the posterior model probability or Bayes 
factor proportional to T. A crucial variable in the analysis is Xi, the number of chromosomes 
in the case pool that carry the i.b.d. haplotype at the i-th SNP. The assumptions above 
specify a distribution for the Xj. A key feature of the present method is that the Xi are not 
independent across SNPs, in constrast to what is assumed in composite likelihood methods 
(e.g. TerwiUiger 1995, Xiong and Guo 1997, Collins and Morton 1998, Maniatis et al. 2004, 
2005). 

All other non-i.b.d haplotype is called heterogenous "non- ancestral haplotype". The 
allele present in non-ancestral haplotype is assumed conditionally independent across chro- 
mosomes within SNPs, and across SNPs within chromosomes, with 71^(1) the probability of 
a 1 allele at the i-th SNP and 7rj(0) = 1 — 7rj(l). (This is equivalent to assuming Hardy- 
Weinberg and linkage equilibrium in blocks of non-ancestral chromosome.) This assumption 
is a very bad one when haplotype or genotype data are available (Liu et al. 2001, Morris et al. 
2002, Li and Stephens 2003), but when only multilocus allele frequency data are available it 
may be more innocuous; this assumption leads to a massive simplification in the likelihood. 
I assume an independent beta prior for each 7ri(l), with parameter Pj = (Pj i,Pj^o), so the 
prior mean is Pi,i/(Pj,o + Pi,i)- The method should be robust to misspecification of Pj as 
long as both elements are much smaller than Uc- 

The allele present on the ancestral haplotype at the position of the i-ih. SNP, a^, is 
assumed to be a single draw from the same distribution as for a block of nonancestral 
chromosome. That is, each Oj is an independent Bernoulli variable with parameter 77^(1). 

3. Analysis 
3.1. Overview 

The purpose of the analysis is to compute the posterior distribution for quantities of 

interest. Here I focus on computing the Bayes factor in favour of a model in which a 
QTL is present, versus a model with no QTL. This Bayes factor allows the posterior model 
probabilities to be computed for any prior on the two models. 1 also compute the posteriors 
for /i, r and p, given that a QTL is present, marginalising all other variables. 

The model has a hierarchical structure as shown in figure 1. Note that the joint distri- 
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bution of the Xi depends on ji, r and p, but that the probabihty of the data at the i-th SNP 

only depends on Xi and other variables tt, and a, for which the prior is independent across 
SNPs. The first step of the analysis, in section 3.2, is to perform an independent calculation 
for each SNP that integrates out TTj and aj conditional on Xj. The second step of the analysis, 
in section 3.3, is to integrate out all the Xi and x^ conditional on /i, r and p. This gives the 
posterior density for these three variables. The final step, in section 3.4, is to compute the 
marginal posteriors for each of p, r and p, and the normalising constant or probability of 
the data given models with and without a QTL. 



3.2. Calculations for the i-th SNP 

Note that all probabilities in this section are conditional on Xi, the number of chro- 
mosomes in the case pool carrying the ancestral i.b.d. chromosome. At the i-th SNP let 
yd,i{^) = n^i — l/d,i(0) be the (unknown) actual count of the 1 allele in cases, and 2/c,i(l) = 
ric — yc,i(0) be the actual estimated count of the 1 allele in controls. Let yd,i — (z/d,i(l)) yd,i(0)) 
and a similar notation apply for controls. Under the modelling assumptions we have 

PY{yd,i\xi,ai,7Ti) = Bin {yd^i{ai) - Xi,nd- Xi,TTi{ai)) (2) 
Pr(z/c,i|7rj) = Bin(yd,j(l),nc,7rj(l)) (3) 

where Bin {x, n, p) is the probability of observing x successes in n independent trials with 
success probability p (and understood to be zero unless Q < x < n). For a more detailed 
motivation of (2) and (3) see Johnson (2005 appendix A) 

Then the probability of the observed data can be written 
Pr (yd,i, yc,i\Xi, Oj, TTj) = X] 5^ (^'i'^' ^c,i|yd,i, yc,i) Pr (yd,iki, t^i) Pr (z/cikj) . (4) 

We can simplify at this stage by writing down the distribution marginal to and TTj 
and moving the respective sum and integral as far inside the expression as possible, to get 

Pr {yd,h yc,i\xi) = ^^yP^ {yd,i, yc,i\yd,i, y^) 
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The innermost sum over can be rewritten 

Fr{yd,i\xi,TTi) = ^Fi:{yd,i\xi,ai,Tii)Pr{ai\ni) (6) 

= X] ( Bin (t/d,i(ai) - + 1, na - + 1, 71^(0^)) 

V?/d,i(ai) - a:;^/ / V?/d,i(ai) -^i + lj J 
= X f Bin (yd,i(ai) - a;, + 1, na - + 1, 7ri(ai)) 

rid - + 1 y ■ 

This allows the integral over Hi to be computed analytically when (as assumed above) the 
prior Pr (7rj(aj)) is a beta distribution with parameter {Pi^ai, Pi,i-ai) 

Pr(v.,,y,M = /pr(!/.,,|x„..)Prfc>,)Pr(..)d., (9) 

(nd -Xi + 1)! r(nc + P,,o + 



E 



X 



X 



r(nd - Xi + 1 + He + Pifl + Pi,i) 

r(t/d,»(a») - + 1 + VcMi) + Pt,aJ 

(l/d,i(aj) - Xi + 1)! r(|/c,i(ai) + Pi,aJ 

r(|/d,i(l - ttj) + |/c,»(l - aj) + Pj.l-qJ 

?/d,i(l - ai)\ r(|/c,i(l - ai) + Pi,i-aJ 



^yd.(aO-Xi + l\ 

nd - + 1 y 

Thus the probability of the observed data reduces to 

Pr {yd,i, = XI XI (^d'^' yc,j) Pr (yd,i, ycA^i) (n) 

2/d,i yc,i 

where the first term in the summand is the error model and the second term is given by (10). 

Because yd,j and ijc^i arc fixed and a.^ and Hi have been integrated out, values of (11) 
for each Xi can be precomputed and stored in a small lookup table of size (n^ +1). It is 
therefore feasible to use a comphcated and hopefully reahstic error distribution, as discussed 
above. For many error models, computation can be speeded up without loss of accuracy by 
judicious reduction of the range of the summation in (11). 



9 



3.3. Hidden Meirkov Model for Xu, and the Xi 



Throughout this section, all probabilities are conditional on given values of /x, r and p 
but to make a clearer exposition this is suppressed in the notation. 

Assume for clarity of exposition that is a position within the battery of marker SNPs. 
Let \p\ = min {i : rrii > fi} and lfi\ = max {i : rrii < p} denote the indices of the SNPs to 
the right and left of the QTL. The algorithm works with obvious modifications when is a 
position outside the battery of marker SNPs. 

Under the modelling assumptions, as we move away from the position of the disease 
locus, the number of chromosomes containing i.b.d. ancestral chromosome, Xi, is Markovian. 
The transition probabilities of a (nonstationary and inhomogenous) hidden Markov model 
(HMM; see e.g. Durbin et al. 1998 ch.3), for marker SNPs to the right of p (i.e. < i), are 



Similar equations hold for for marker SNPs to the left of fi, and for Pr (x,;|xp). The emmission 
probabilities of the HMM are given by (11). There is an important difference between 
this HMM and the HMMs of MePeek and Strahs (1999), Morris et al. (2000), Zhang and 
Zhao (2000, 2002) and Liu et al. (2001). Those authors modelled the observed haplotypes 
or genotypes as a set of conditionally independent HMMs, conditional on (in the present 
notation) (oi, . . . , a^), (tti, . . . , tt^) as well as fi, r and p. Thus they had to use expensive 
numerical algorithms (optimisation or MCMC) on the high dimensional space of variables 
on which the HMMs were conditioned. Here I am able to model all the data as a single 
HMM conditional on only p, r and p. I am thus able to integrate out the high dimensional 
(ai, . . . , Ol) and (tti, . . . , til) using an efficient numerical algorithm and am left with only 
a low dimensional space (the space for {p,T,p)) on which I will need to use an expensive 
numerical algorithm. 

We can use the backwards propagation algorithm for HMMs to sum over the hidden 
states {xi, . . . ,xl) (see e.g. Durbin et al. 1998 ch.3, Liu 2001 pp. 28-31). Readers familiar 
with HMMs may wish to skip the rest of this section. 

Define the backwards variables for SNPs at positions to the right of the QTL 



Pr {xi+i\xi) = Bin (xj+i, Xi, exp (-r x (mj+i - rrii))) 



(12) 



b{xi) := Pr {yd,i+i, Vci+i, yd,L, VclIxi) 



(13) 



and for SNPs at positions to the left of the QTL 



b'(xi) := Pr (yd,i, Vci, ^d,i-i, yc,i-i\xi) 



(14) 
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Here backwards is relative to the direction in which the hidden process is Markovian. Equa- 
tion (14) should not be confused with a forwards variable, which are not used in this compu- 
tation. Note also that the backwards variables are functions of which SNP {{) and the value 
of that Xi at that SNP, but that I have adopted a more streamlined notation that I hope is 
unambiguous. 

The backwards variables have the obvious interpretation when the arguments run out 
of range, that 

b{xL) := Pr(nothing|a:L) = 1 (15) 
b'{xi) := Pr (nothinglxi) = 1 (16) 

Using (16) to initialise the backwards variables at i — L, we then proceed to propagate 
leftwards along the chromosomes for each i — L — 1, . . . , [//] in turn, compute the backwards 
variables for every (xi) using 

K^i) = {xi+i\xi)FT {yd^i+i,yc^i+i\xi+i)b{xi+i) (17) 

Xi+l 

and then terminate the algorithm by computing 

{yd,\^^],yc,\^^, ■ ■ ■ ,yd,L,yc,L\Xf,) = J^Pr(xr^]|x^)Pr(j/d,M>J'c,MkM)K^M) (18) 

for every x^. 

Likewise, using (16) to initialise the b' backwards variables at i = 1 we can propagate 
rightwards along the chromosomes (backwards in the sense that time or space is usually 
considered in HMMs) for i = 2, . . . , [//J and terminating in a symmetric manner to compute 

Pr {yd,u yc,i: ■ ■ ■ , yd,l^li , ^cLmj k^)- 

We then obtain the probabihty of all the data by computing 
Pr(^) = FT{yd,i,yc,i,---,yd,L,yc,L) 

= XI Bin {x^, Ud, p) Pr (j/d,i, yc,i, ■ ■ ■ , yd,L/xj , 

Pr{yd,\n],yc,\n],- ■ ■,yd,L,yc,L\x^)^ (19) 

Restoring the conditioning that has been implicit throughout this section, (19) is Pr (y |//, r, p) , 
the probability of all the data conditional on {p,T,p) and marginal to {ai, . . . jQl) and 

(tti, . . .,'Kl),. 
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3.4. Mcirginal Posteriors for fi, r and p and Model Probabilities 

The posterior for fi, t and p, up to a normalising constant, is obtained simply by 
multiplying (19) by the relevant priors. 

Pr (y, /X, T, p) = Pr (y|/x, r, p) Pr {p) Pr (r) B (p, i?o) (20) 

1 

Pr (p, r, pli,) = Pr {y\p, r, p) Pr (p) Pr (r) B (p, R^) —— . (21) 

Pr (y) 

Summarising this posterior is not entirely trivial for large data sets, because computing the 
posterior at any given point {p, r, p) using the propagation algorithm of section 3.3 takes on 
the order of Ln\ operations (and all addition has to be done in log-space, see e.g. Durbin 
et al. (1998 p. 77-78)), so we cannot rely on being able to make an arbitrarily large number 
of such computations. 

I use Cartesian product quadrature (CPQ; see e.g. O'Hagan and Forster 2004 9.43- 
9.44) to compute marginal posteriors for each of p, r or p, and the normalising constant or 
marginal likelihood Pr {y) assuming there is a disease QTL. CPQ makes the approximation 

Pr(y) = Pr (y, p,r,p) dp dr dp 

^ Y.ilY.^f^t^^\'^'^Ay,N,n,pi) (22) 

3 k I 

where e.g. j indexes a set of design points {pi, p2, ■ ■ ■}, and is a weight associated with 
the j-th design point. (A quantity proportional to) the marginal posterior for any variable 
p, T or p is obtained by omitting the respective sum from (22). When computed using 
priors describing our beliefs about these variables given that there is a QTL in the region of 
interest, the quantity in (22) will be called Pr (y|QTL). 

The choice of design points and weights depends on the region of interest and the prior, 
and the quadrature rule to be used. For example, all calculations in section 5 the region of 
interest is p G (0, 1), measured in Mb or cM. With a uniform prior for p, exponential prior 
for T with T = 1/1000, and flat beta prior for p with Ri = Rq = 1,1 used 100 design points 
for each variable as follows: 

p,- = (j + 1/2) /lOO, wf^ = 0.01, J = 0, 1, . . . , 99 

Tfe = exp (A;/ll), w^^^ = exp(A;/ll)/ll, /c = 0, 1, . . . , 99 
p^ = (£+ 1/2)/100, wf^^Qm, £ = 0,1,..., 99 (23) 
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Here the quadrature rule is very simple and corresponds approximately to the trapezoid rule 
or two point Newton-Cotes rule. 

A simple model with no QTL is to assume there are no blocks of i.b.d. haplotype, Xi — 
for all i. This corresponds to a degenerate prior at p = (or the limits /i — > ±oo or r — > oo). 
The probability of the data under this model, Pr(y|no QTL), is easily computed directly 
from (11). The Bayes factor (BF) in favour of the model with a QTL is then 

Pr(QTL|y) / Pr (QTL) _ Pr(i/|QTL) 
Pr (no QTL|y) / Pr (no QTL) Pr {y\no QTL) ' ^ ^ 

In addition to its Bayesian interpretation, the BF (or any isotonic transformation thereof) 
has good properties, from a classical frequentist perspective, as a test statistic to test the null 
model with no QTL against the alternative with a QTL (O'Hagan and Forster 2004 ch.5, 
Patterson et al. 2004). Tests based on the BF are admissible, which means that no other 
test has greater power for all (/i, r, p). There may be other tests that have greater power for 
some {iJ,,T,p), and are therefore also admissible, but it is "unusual and strange" to find an 
admissible test that is not based on the BF computed using some prior. Furthermore, (up 
to isomorphism) the BF uniquely maximises average power, when the averaging is done with 
respect to the prior for {/i, r, p) used to compute the BF. Of course, this theory only applies 
when the model is correct, but we might hope that the most powerful test for an approximate 
model would be approximately most powerful for a more realistic model. The BF is a sensible 
way to combine information across many markers to produce a single test, and thus avoids 
(or overcomes) the problematic need to correct for multiple testing (Patterson et al. 2004). 

I have also implemented a Markov chain Monte Carlo (MCMC; see e.g. Gilks et al. 
1996, Liu 2001) sampler, which uses the Metropolis-Hastings algorithm to sample from the 
posterior density for (/i, r, p). A proposal distribution that seems to work reasonably well is 
to update a single variable, choosen at random with equal probability, using the following: 

p' ~ N(//,(0.1(mi-mi))2) 
t' ~ G(10,t/10) 

r' ~ B(5r,5(l-r)) (25) 

Kernel based methods (see e.g. Silverman 1986) can then be used to estimate marginal 
posteriors for each of the variables, and these seemed similar to the marginal posteriors 
computed using CPQ in the cases I examined. However, standard numerical methods to 
estimate the model probability Pr (y) from the MCMC output (see e.g. O'Hagan and Forster 
2004 10.46) converged very slowly and did not seem reliable. 

In addition to the ease and reliability with which the normalising constant or BF can 
be computed, CPQ offers substantial advantages over alternatives such as MCMC in low 
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dimensional situations such as this one. There are no concerns about burnin or mixing. 
CPQ makes efficient use of the evaluation of the posterior density at each design point. 
We can investigate sensitivity to prior specification afterwards without redoing much of the 
computation. 

In this particular situation CPQ offers an additional advantage, that by traversing 
the design points in a particular order many of the backards variables computed in the 
propagation algorithm of section 3.3 it be stored and reused, and thus a CPQ algorithm 
with n design points runs much faster than a MCMC algorithm with n samples. The details 
arc as follows: The posterior can be computed on all points on a three dimensional lattice 
most efficiently by traversing the lattice with different values of r in the outermost loop, 
different values of ^ in the middle loop, and different values of p in the innermost loop. This 
is because each time p changes but pL and r do not then only (19) has to be recomputed. Each 
time p changes but r does not then (18) always has to be recomputed, and if the old and new 
values of p are separated by one or more typed SNPs then one or more columns of backwards 
variables also have to be recomputed. If p always increases then the h are all computed first 
and then as the lattice is traversed extra columns of h' are computed and columns of h 
are simply discarded. Changing r means that the transition probabilities (12) change and 
everything has to be recomputed. The run time of the whole algorithm (propagation and 
CPQ) scales quadratically in Ud and linearly in L + d^j, where dfj, is the number of design 
points used for If a small map region is found to be interesting additional design points 
can be added later at moderate cost. 

The disadvantages of CPQ are that we learn little about the posterior until calculations 
have been completed for all the design points, we may belatedly discover that our choice 
of design points was not a good one, that MCMC may be more sensitive to very narrow 
spikes containing substantial probability mass (these will be missed if they fall inbetween 
the design points) and that CPQ will not scale well to higher dimensional spaces which we 
might need to study if we elaborated the model. 

4. Coalescent Simulation Model and Error Model 

In this section I describe a more realistic model that was used to simulate datasets on 
which to test the method described above in sections 2-3. I also describe a specific model 
for errors in allele frequency estimation. Each simulated dataset was generated as follows. 

First I used the mksamples program of Hudson (2002) to simulate a sample of 20,000 
1Mb long regions, assuming the standard neutral coalescent model with population recom- 
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bination rate 4A^eC = 400 (A^e = 10, 000 assuming IcM/Mb), and assuming the infinite sites 
mutation model with population mutation rate ANefi = 10. (This is an unrealistically low 
mutation rate, the idea is to simulate some of the SNPs in the region rather than all of 
them.) Chromosomes were paired at random to generate a sample of 10,000 individuals. 

One SNP with a minor allele frequency between 10% and 20% was chosen as the disease 
QTL. The disease status of each individual was simulated assuming multiplicative risks, so 
that 7oi/7oo = 7ii/7oi = 9- Here 7g is the penetrance (probability of having the disease) 
given genotype G at the disease QTL, with 1 the minor allele. The parameter g is called the 
allelic or genotype relative risk (see e.g. Risch and Mcrikangas 1996). Simulations for this 
paper used values of g = 4 and g = I. The penetrance of the wild type homozygote, 700, 
was set so that the marginal probability of having the disease was 0.02. (Thus the number 
of case chromosomes was random with expectation 0.02 x 10, 000 x 2 — 400.) Data from 
all nd/2 case individuals, and an equal number nc/2 of randomly chosed control individuals, 
were used to make up the two pools. 

Excluding the disease QTL, all simulated SNPs with a minor allele frequency greater 
than 0.05 in the nc/2 individuals in the control pool were analysed, so the number of SNPs 
L was also random. The estimated allele frequencies at each SNP and for each pool were 
either assumed to be known exactly, or assuming that allele frequencies were estimated using 
the lag between kinetic PGR growth curves (Germer et al. 2000), using 

y — X n . (26) 

Here y is shorthand for the estimated count of the 1 allele, n is the number of chromosomes 
in the pool, ACt = C't(l) — C't(O), and Ct{a) is the number of PGR cycles before the amount 
of PGR product for allele a exceeds some threshold level (Germer et al. 2000). The model 
for Pr(y|?/,e) is then as follows: Define the true lag ACj = log2((n — y)/y), which is the 
lag that would give the correct frequency when (26) was used. 1 assume that the observed 
lag ACt averaged across r experiments is normally distributed with mean ACt and variance 
(T^/r, where is the variance in lags across replicate experiments. 

Using the Jacobian 

-ln{2)y{n-y)- (27) 



d{ACt 



n 



we can write down the error model in the form required in (4) for the analysis 



Pr (y\y, n, a^, r) — - — J^. 7^ — , exp 

^ ln(2)y(n-y)^/w7f 



y{n-y) 



21n(2)V/r 



(28) 
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5. Testing the Method 

All measurements of the performance of the Bayesian method described in sections 2- 
3 are based on analyses of datasets simulated under a more realistic model, as described 
in section 4. Results are reported for two situations, either where the allele frequencies in 
each pool are known exactly, or where there are errors in allele frequency estimation using 
(28) and assuming that n, r — 2 rephcates and a — 0.2 PGR cycles are all known. This 
magnitude of error is comparable to those reported by Germer et al. (2000) and Shiffman 
et al. (2004). Using (27) we can say that these parameter values correspond to a "typical" 
error in allele frequency estimate y/n of about y/n{l — y/n) In {2)a/y/2 ~ y/n{l — y/n)0.098 
or, for intermediate allele frequencies, about 2.5%. 

For each situation, allele frequencies known either exactly or estimated with errors, I 
analysed 500 datasets simulated assuming there was a QTL with a genotype relative risk 
(7 = 4, and 500 datasets simulated assuming a null model with no QTL ((? = 1, so the 
penetrances 700 = 7oi = 7ii = 0.02 are all equal). In these simulations, the median number 
of case or control individuals, nd/2 = nc/2, was 200 (interquartile range 191-209, range 
154-248). The median number of SNPs, L, was 28 (interquartile range 24-32, range 12-51). 
These simulations assumed relative risks that are higher, and correspondingly sample sizes 
that are smaller than may be realistic for many studies of QTLs influencing complex genetic 
diseases. This reflects the need to analyse a reasonably large number of simulated datasets 
with the computing resources currently available to me. The mean time to run an analysis on 
a simulated dataset, using GPQ with the design (23) which requires evaluating the posterior 
at 10^ points on a 100 x 100 x 100 lattice, was about 36 minutes on a 2.4GHz Intel® Xeon"""*^ 
processor (totalling about 50 processor days for all the simulated datasets). 

The inferences from the Bayesian method described here are compared against simple 
but widely used classical single point analyses. When allele frequencies in each pool are 
known exactly, a chi squared test can be used on the counts of the two alleles in the two 
pools (see e.g. Clayton 2001), at each marker SNP separately. Visscher and Le Hellard (2003) 
consider how to perform an equivalent test when the errors in allele frequency estimates are 
Gaussian. The relatively small Gaussian errors in ACt simulated here will produce errors 
in allele frequency estimates that are approximately Gaussian (to the extent that (26) is 
linear, and in fact are underdispersed relative to a Gaussian in the direction of extreme 
allele frequencies). Visscher and Le Hellard (2003) show that a "shrunk" test statistic is 
approximately distributed as xfi) ■ This shrunk statistic is equal to the ordinary statistic 
computed using a point estimate of the counts, times a factor 2Vs/ (2K,. + Ve) where Vg is 
the estimated sampling variance of the allele frequency due to sampling a finite number of 
cases and controls, under the null hypothesis of equal allele frequency in cases and controls. 
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and Ve is the variance of the allele frequency in either pool due to experimental error. Using 
(27), for the simulations performed here this shrinking factor is (approximately, for small a) 



2 + (rid + ne)p(l - p) In (2) VVr ' 

where p is the allele frequency estimated under the null hypothesis, i.e. by pooling the case 
and control pools. 

The most widely considered statistics from a classical single point analysis are as follows: 
Let ]9niin be the smallest p- value of the L (shrunk) chi squared tests applied to a given dataset, 
and let /iminp be the map position of the marker with the smallest p-value. 

It is worth emphasising that all the tests described in the following sections concern the 
classical sense performance of statistics computed from the Baycsian analysis. Strictly, the 
Bayesian sense performance can only be tested by conducting a Bayesian analysis assuming 
a more realistic model (or prior) . 



5.1. Power to Detect a QTL 

In this section I compare the power of tests to detect a QTL. I consider two different 
test statistics, and different methods of determining critical regions. The first test statistic 
is 21nBF, twice the logarithm of the Bayes factor (24). The second test statistic is x L. 
Multiplying the smallest single point p- value by L achieves a simple Bonferonni correction 
for multiple testing that makes the critical region independent of L. Critical regions were 
determined either analytically (by arbitrary or approximate methods), or empirically (from 
analyses of datasets simulated under the null model) . I report the performance of tests with 
nominal sizes of a = 0.05 and a = 0.01; a more general comparison is made in figures 2 and 
3. For each test, the true size was estimated using 500 simulations under the null model with 
genotype relative risk g = 1, i.e. where risk is independent of genotype at the QTL. The 
power against an alternative with g = 4 was estimated using 500 simulations. For each test 
I report the estimated size and power, along with exact 95% binomial confidence intervals 
for their values. 

From a Bayesian perspective, 2 In BF > indicates evidence in favour of the model with 
a QTL over the model with no QTL. As tables 2 and 3 show, the test with this critical 
region has small size and reasonable power. However, much more simulation work, for 
different models and combinations of parameters, is required to establish the generality of 
this result. Also, at least from a classical perspective it is desirable to be able to choose a 
critical region according to the size (or perhaps power) that is desired. 
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An arbitrary critical region is 21nBF > 21n(^^). I say arbitrary because this in fact 
guarantees nothing about the classical sense error rate, but does bound the Bayesian sense 
error rate: The posterior probability that there is no QTL is less than a, Pr (no QTL|y) < a, 
when the model, the prior Pr (QTL) = Pr (no QTL), and the prior for (/i, r, p) are correctly 
specified. It can be seen from tables 2 and 3 that these arbitrary critical regions give tests 
with true sizes that are smaller than a. Such tests are therefore conservative. Causes may 
include the simplified model used to compute the BF, the dependence of the BF on the prior 
specification T, and the fact that the critical region bounds the Bayesian sense error rate 
rather than controls the classical sense error rate. The use of these arbitrary critical regions 
21nBF > 21n(^^) entails a loss of power due to the actual size of the test being smaller 
than intended, so a better method for determining a critical region is desirable. 

Assuming goodness of the x^^^^ approximation, with the shrinking factor (29), and using 
simple Bonferonni correction, suggests an approximate critical region pmin x L < a. These 
tests have true sizes equal to or slightly smaller than their nominal sizes, which is expected 
because the Bonferonni correction is conservative. This effect is expected to increase in 
severity as the marker density increases, because there will be a greater number of more 
positively correlated tests. 

Although these respectively arbitrary and approximate methods for determining critical 
regions are not terribly accurate, and cannot be recommended, it is worth noting that there 
is no clear difference in power between the two test statistics, for tests with the same nominal 
size. Since the test based on 2 In BF is more conservative, it might reasonably be preferred. 

It is not very meaningful to compare the power of tests with different sizes. Therefore I 
used simulations to estimate exact critical regions, so that the power of different tests with 
true size a could be compared. For these tests, the estimated size is exactly equal to the 
nominal size, because the same set of simulations are used to compute both values. Although 
the critical region for a = 0.01 is unlikely to be well estimated using only 500 simulations, by 
a simple symmetry argument this procedure still allows a fair comparison across the different 
test statistics. In every case the test based on 2 In BF is substantially more powerful than 
the test based on pmin x L. 

By combining simulations in which there is not and is a QTL, we can view test statistics 
as classifiers, and ask how well they discriminate between the two cases. The receiver 
operating characteristics (ROC) for the two statistics are compared in figures 2 and 3. The 
ROC curves are equivalent to plotting estimated power (= sensitivity) against size of test 
(= 1 — specificity) for all possible tests (in fact, only all tests with non-disjoint critical 
regions). When viewed in this way, it can be seen that the BF based statistic is uniformly 
equal to or superior to the minimum p-value based statistic. The advantage of the BF is 
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greater when there are errors in allele frequency estimation. This may be because, when the 
dataset is less informative, it may be more important to have a model based way to combine 
information across SNPs. 

For comparison, I have also plotted the ROC curves for a test statistic derived from the 
nonparametric likelihood approach that I have described previously (Johnson 2005). This 
nonparametric likelihood ratio (NLR) statistic is defined in the same way as the BF (24), 
using the same value for Pr (^|no QTL), but makes the approximation 

L 

Pr (y IQTL) ~ J] Pr (yd,i, ^c^k*) (30) 

where [x\^x\^ . . . ^x*i) is the set of hidden states in the HMM that maximise the proba- 
bility (30) under an order restriction that they are either a weakly increasing sequence, a 
weakly decreasing sequence, or a weakly increasing then weakly decreasing sequence. This 
order restriction must be true regardless of the shape of the genealogy at the QTL. The NLR 
statistic is not a very good approximation to the BF, in particular because it can never be 
negative. As far as I know, there is no theoretical reason to believe that it should have good 
properties as a test statistic. However, as figures 2 and 3 show, tests based on the NLR are 
superior to tests based on j)-a\m x ^ and are not clearly distinguishable from tests based on 
the BF. For the simulated datasets studied here, once the lookup table of cmmission proba- 
bilities (11) has been computed, computing the NLR is over 10'^ times faster than computing 
the BF. Furthermore, a Viterbi-like algorithm (see Durbin et al. 1998) for computing the 
NLR has time complexity 0(L x nd), compared with the CPQ algorithm for computing the 
BF which has time complexity 0((L -|- d^) x n^). 

It may concern some readers that the critical regions and sizes and powers of tests were 
all estimated while allowing the numbers of cases, controls and marker SNPs all to vary across 
simulations. To interpret results acquired in this way, a formal classical framework would 
require us to view the genotype relative risk g as the single parameter, and the number of 
case chromosomes and number of SNPs L as random variables. It is true that even in such 
a framework we would normally wish to perform tests conditional on the values of ancilliary 
variables such as and L that contain no information about whether there is a QTL in the 
region. However, it is a feature (or weakness) of classical inference that one is often free to 
choose whether to condition on any given variable (but sec e.g. Jayncs 1976, Robinson 1979). 
The present results therefore do have a sound classical interpretation. In any case, the small 
number of simulations performed here do not allow the luxury of estimating critical regions 
conditional on rid or L. The simulation procedure as used reflects a likely feature of real 
datasets, that SNP density will be higher in regions of the genome where the genealogy is 
deeper. To alter the simulation procedure so that all simulated datasets had the same value 
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of L would require the introduction of an ad hoc algorithm to select the L markers to be 
used from a larger number of candidates. 

As shown in figure 4, the null distributions of both test statistics show a negative 
relationship with L. The negative relationship is most pronounced for the 2 In BF statistic, 
for the situation where there are errors in allele frequency estimation. In this case a linear 
regression of 21nBF on L had a slope significantly different from zero {p = 0.010), and if the 
values of 21nBF are partitioned into two groups according to the rank of L, the hypothesis 
that they are drawn from the same distribution can be rejected using a Kolmogorov-Smirnov 
test {p = 0.004). These tests do not detect significant dependence (p > 0.05) for the 21nBF 
statistic when the allele frequencies are known exactly, or for the Pmin x L statistic. 

Although the simulations described here are adequate for demonstrating the superiority 
of the BF based test over the Pmin based test, we should be cautious about extrapolating 
from the current results. In particular, it seems that the arbitrary (21nBF > 21n(i^)) 
or approximate (Bonferonni) critical regions described above will become more conservative 
as SNP number or density increases. Performing tests that are not conditioned on SNP 
number and density will introduce recognisable subset biases (see e.g. Robinson 1979). In 
a real situation, a critical region should be determined using simulations conditioned on as 
many ancilliary statistics of the observed data as possible, although for complex simulation 
models it may be a matter of guesswork which statistics are approximately ancilliary. An 
approach that could be most useful in practice is a variant of the permutation test of Churchill 
and Doerge (1994). This could be applied if there were matched pairs of pools of cases and 
controls, and each pair were typed in separate DNA pooling experiments (Shiffman et al. 
2004). Then the phenotype labels could be permuted within each pair, giving a set of 
equiprobable values for any test statistic under the null hypothesis. Such an approach could 
not be explored here because it would require too much computation. 



5.2. Sensitivity to prior specification 

It is important to appreciate that the Bayes factor does not depend on the prior probabil- 
ities for the two models (QTL or no QTL), but does depend on the priors for the parameters 
within the QTL model. Misspecification of these priors could adversely influence the perfor- 
mance of the BF as a test statistic, and it is important to examine typical levels of robustness 
to the prior. To explore this, I compare the analyses above that used relatively flat generic 
priors to analyses that used priors that were in a way optimised for the simulated datasets 
under consideration. 
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Note that the prior for ji is correct, but that the approximate model here uses two 
other parameters r and p that do not have any direct correspondance to parameters of the 
coalescent model that the data were simulated under. It is therefore difficult to say what 
the best prior is for analysing the simulated datasets. Loosely speaking, we might imagine 
that for any one simulated dataset, in the limit of an infinite amount of informative data 
the posterior for r or p would converge to a single value, which we could call the "best 
approximating" value for that dataset. However, with less than an infinite amount of data 
the posterior mean for either variable would lie somewhere between the prior mean and 
the best approximating value. Thus the distribution of posterior means across simulations 
would be (very loosely speaking) inbetween the degenerate distribution at the prior mean, 
and the distribution of best approximating values. Figure 5 shows that this is indeed the 
case for yU, for which the prior mean is 0.5 and the true correct prior is uniform on [0, 1]. The 
distributions of posterior means for r and p shown in figure 5 suggests a lognormal prior for 
T (with In (r) having prior mean 6.8 and prior standard deviation 0.74) and a beta prior for 
p (with Ri = 3.2 and Rq = 7.8, p having prior mean 0.29). Here I am assuming independent 
priors. Note that this exercise in prior specification was totally ad hoc. 

As can be seen in figures 2 and 3, the ROC of the test statistic 21nBF computed using 
these priors is hardly different better than that using the original priors. The power for tests 
of sizes a — 0.05 and a — 0.01 is not significantly different, based on 500 simulations. This 
suggests that the performance of the BF as a test statistic, for these datasets, is quite robust 
to prior specification within the QTL model. 

Because most of the computation in QPQ can be reused, computing the BF for a dif- 
ferent prior took on average less than three minutes, compared with the 36 minutes required 
to compute the BF for the original prior. 

5.3. Estimation of QTL Position 

Figures 6 and 7 show analyses of four randomly chosen simulated datasets with QTLs 
(gf = 4). These illustrate the fact that these datasets contain only weak information about 
the position of the QTL (or at least that the Bayesian method described here only extracts 
weak information). It nonetheless seems worthwhile to examine how much information is 
present. 

It has been suggested that the map position of the marker with the most significant single 
point test result (i.e. the minimum p- value) would be a "good" point estimate for the position 
of the QTL (Kaplan and Morris 2001a,b). However, I point out that it is asymptotically 
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inadmissible for a model very similar to the one assumed here. This argument considers the 
limit of a QTL of small effect. One can imagine models where the position of the marker 
with the minimum p-value, ftminp, will be tend to become uniformly distributed on (0,1), 
independent of the true value of fi, as the effect of the QTL tends to zero. The estimator fi„imp 
then has expected loss — fi) + ^ under absolute error loss and | — — /u) under squared 
error loss. The estimator /i = | has uniformly lower expected loss, || — /x] under absolute 
error loss and (| — /x)^ under squared error loss. This argument does not technically apply 
for the model simulated here because SNPs (including the QTL) tend to be concentrated in 
regions where the genealogy is deepest, so even completely ignoring the genotype data, the 
position of any SNP is informative about the positions of all other SNPs including the QTL. 
It does however suggest that better point estimates may be found, and suggests what their 
asymptotic behaviour ought to be, at least approximately. 

The performance of different methods for estimating the position of the QTL was as- 
sessed using the 500 simulations with g — 4:, for the two situations with and without errors in 
allele frequency estimation. Due to the nature of the simulations performed here, the errors 
reported are averaged over the distribution of the true value of fi. They are therefore not 
classical expected losses in the strict sense, but expected losses averaged with respect to a 
distribution of parameter values. Bayesian point estimators have uniquely best performance 
when measured in this way (O'Hagan and Forster 2004 ch.5); the theory requires that the 
model and prior are both correct. In particular, under squared error losses the average ex- 
pected loss is minimised by the posterior mean, and under absolute error losses the average 
expected loss is minimised by the posterior median. As shown in table 4, point estimators 
derived from the posterior calculated using the Bayesian method described here are superior 
to /xminp, the map location of the marker with the smallest p-value. When allele frequencies 
in each pool are known exactly the Baysian analysis produces a 21% reduction in root aver- 
age mean squared error and a 13% reduction in average mean absolute error. When there are 
errors in allele frequency estimation the figures are similar, 18% and 11% respectively. The 
nonparametric method developed previously by me (Johnson 2005) produces point estimates 
that are competitive with the Bayesian method under squared error losses. 

Figures 8 and 9 show results from the 500 datasets simulated with g = 4:. The coverage 
of credibility intervals constructed from the marginal posterior for /i falls well below nominal 
levels. This suggests strongly that the simple model used for the analyses is not a good 
approximation to the more realistic model the data were simulated under. One way to 
improve the model would be to allow a more realistic model for the shape of the genealogy 
at the QTL. To explicitly model this genealogy and hence the joint distribution of breakpoints 
between ancestral and nonancestral chromosome would require something like the MCMC 
sampler of Rannala and Reeve (2001) or Morris et al. (2002). Although taking such an 
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approach is highly desirable, it may not scale well to large datasets and it seems worthwhile 
to investigate approximations. 

One approximation is the "pairwise correction" derived by McPeek and Strahs (1999) 
and justified by them by the use of a quasi-score function, and used in a Bayesian context 
by Morris et al. (2000). Essentially, this involves flattening the likehhood function by raising 
all likelihoods to a power = (1 + (n — l)c„)~^ < 1. Here Cn is the pairwise correlation 
over sampled chromosomes of the conditional score function for the position of the QTL. 
An expression for c„ for a coalescent model is given in appendix D of McPcck and Strahs 
(1999). The n in this equation (which c„ also depends on) is the number of chromosomes 
carrying the QTL. It is not at all clear whether or how this correction should be applied in 
the present context, because (i) as noted by Morris et al. (2000) the quasi-score justiflcation 
used by McPeek and Strahs (1999) does not apply in a Bayesian setting, (ii) in the present 
work the likelihood is never written as a conditional product across chromosomes carrying 
the QTL, (iii) it is not known how many chromosomes carry the QTL, and (iv) in my 
computational implementation no proper likelihoods are ever calculated, only likelihoods 
marginal to (tti, . . . ,7rL). However, the following ad- hoc approach does produce corrected 
credibility intervals that achieve coverage very close to their nominal levels. The procedure 
is to first estimate n by ndE(p), the product of the number of case chromosomes and the 
posterior expectation of p, and then to fiatten the marginal posterior for /x by raising it to a 
power Wn and renormalising. For the simulations performed here, Wn had median 0.56 and 
interquartile range 0.48-0.63. When this procedure was applied, good agreemement between 
nominal and achieved coverage is obtained (figures 8 and 9). This suggests that the most 
serious misspecification of the current model is the assumption of a star shaped genealogy, 
rather than the assumption of linkage equilibrium in nonancestral blocks or the absence of 
the disease variant in the control pool. 



5.4. Application to real data 

It is not really possible to examine the effectiveness of the Bayesian method described 
here on real data, due to a lack of relevant published datasets. Primarily for the piuposc 
of comparison with other fine scale mapping methods, I have applied it to the dataset of 
Hosking et al. (2002), and to quasi-synthetic datasets generated from that dataset. Hosking 
et al. (2002) collected data using individual gcnotyping. In order to pretend that the data 
were acquired using DNA pooling, I use a hypergeometric error model to relate the observed 
counts with missing data to the underlying full data that were not observed. This assumes 
the data are missing at random within and across SNPs. 
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To my knowledge, no fine scale mapping method has been published that does not 
perform well on the data of Hosking et al. (2002). Therefore, observing that the present 
method performs acceptably, as shown in figure 10, is not necessarily encouraging. To 
simulate a disease with a complex genetic basis, I generated three datasets by randomly 
relabelling controls with probability 10%, 20% or 30%. As shown in figure 10, on all 

four datasets 95% credibility intervals covered the true location of the CYP2D6 gene after 
the correction factor of (McPeek and Strahs 1999) had been apphed to flatten the posterior. 
This provides weak evidence that the method developed here may be reliable for mapping 
QTLs from real data. 



6. Discussion 

In this paper I have described and tested a Bayesian method for detecting and mapping 
a QTL, using multilocus data collected using DNA pooling within two trait groups. 

Relatively recently, likelihood based fine scale mapping methods have been developed for 
genotype data that build on previous haplotypc based analyses by treating the unobserved 
haplotypcs as missing data and integrating over all possible haplotypes that are consistent 
with the observed genotypes. This integration can be performed either using Markov chain 
Monte Carlo (MCMC) (Liu et al. 2001, Reeve and Rannala 2002, Morris et al. 2003) or 
using exact numerical methods for hidden Markov models (Zhang and Zhao 2002). Data 
from DNA pools arc estimated counts of alleles at each locus with no phase information. 
Fine scale mapping from genotype data and from DNA pools can in theory be regarded as 
closely related missing data problems. 

The approach taken in this paper combines elements of the approaches of Zhang and 
Zhao (2002) and of Morris et al. (2000) and Liu et al. (2001). Like Zhang and Zhao (2002), I 
use a model that is sufficiently simple that 1 can use hidden Markov model (HMM) methods 
to sum over all possible haplotypcs that arc consistent with the observed data. However, 
after computing the likelihood using a propagation algorithm, Zhang and Zhao (2002) then 
maximise that likelihood with respect to the remaining model parameters. In contrast and 
hke Morris et al. (2000) and Liu et al. (2001), I embed the HMM within a fully Bayesian 
approach and compute posterior probability distributions for the quantities of interest. 

One advantage of a Bayesian approach is that probability statements can be made di- 
rectly about quantities of interest. For example, we can state the probability that there 
is QTL in any given region, including the whole region under study. Thus, mapping and 
detecting a QTL are intimately related aspects of the same analysis. They are different infer- 
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ences that are made from the same posterior probabihty distribution. Within the Bayesian 
framework there is no need to choose between a bewildering array of estimators, test statis- 
tics and methods for correcting for multiple testing; the approach has a pleasing simplicity, 
at least conceptually. 

However, the probabilities computed in a Bayesian analysis are only meaningful if the 
model and prior arc realistic. The Catch-22 is that in order to compute Baysian posterior 
probabilities, I had to assume a model that was worringly oversimplified and not very believ- 
able. The present work is therefore best regarded as a step towards Bayesian analysis of data 
collected using DNA pooling. It may be helpful to draw parallels with methods for analysis 
of genotype data (collected using individual typing). Sadly, the present method allows us 
to make inferences assuming a model less elaborate than the one of Morris et al. (2000), 
whereas we might aspire to being able to assume a model like the one of of Morris et al. 
(2002) or ZoUner and Pritchard (2005). However, Bayesian analysis of such realistic models 
has required Markov chain Monte Carlo (MCMC) to integrate over high dimensional spaces 
of auxiliary variables or missing data. Such computationally intensive approaches may have 
difficulty handling large datasets. In contrast, the method described here is relatively fast, 
and large datasets could be analysed with realistic computational resources. For example, 27 
processor- days would be required to analyse data from a whole genome scan with 100 cases, 
500,000 SNPs, and evaluation of the posterior at points 50kb apart. In contrast, ZoUner 
and Pritchard (2005) estimate that their MCMC based procedure for data from individual 
typings would take 85 processor-years for the same scale of analysis. A further advantage of 
avoiding Monte Carlo methods is that the large numbers of analyses needed for a shding win- 
dow analysis, or a permutation test, can be performed without needing human intervention 
to adjust mixing parameters or monitor convergence. Finally and perhaps most significantly, 
I am able to compute a Bayes factor (BF) to compare models in which there is, and is not, 
a disease QTL in the whole region of interest. To my knowledge, no association mapping 
method using genotype data is able to do this, although Patterson et al. (2004) are able to 
compute a BF for admixture mapping using genotype data. 

There is a Bayesian justification for the present method. ("This is the best model for 
which a Bayesian analysis of data from DNA pools is currently possible.") However, serious 
concerns about model inadequacy ("Well, that model simply isn't good enough!") mean 
that, in this paper, I have mostly focussed on the classical frequentist justification. Us- 
ing simulations assuming a more realistic model, I have shown that the present method is 
uniformly superior to classical single point methods of analysis. Single point methods are 
the most obvious way to analyse data collected using DNA pooling, although composite 
likelihood methods (Terwilliger 1995, Xiong and Quo 1997, Collins and Morton 1998, Ma- 
niatis et al. 2004, 2005) could also be used. The simulation results demonstrate that the 
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BF computed using the present method makes a more powerful test for the presence of a 
QTL than the minimum p-value from single point tests, that the posterior density for the 
position of the QTL leads to a better point estimator than the position of the marker with 
the minimum p-value, and that well calibrated credibility intervals can be derived from the 
posterior density for the position of the QTL, after applying the correction of McPeek and 
Strahs (1999). 

The performance of composite likelihood (CL) methods was not examined here. This 
was because no CL method has been developed that allows errors in allele frequency esti- 
mation, and because, to my knowledge, no CL method assumes a model that is obviously 
more realistic than the model assumed by the present method. In particular, all CL methods 
implicitly assume linkage equilibrium in non-ancestral blocks of chromosome. In the nota- 
tion of the present paper, CL methods assume that the number of chromosomes carrying 
ancestral haplotype at the i-th SNP, Xi, is conditionally independent across SNPs. Even a 
poor model that does capture some aspect of the dependence across SNPs, such as the star 
shaped genealogy assumed here, seems preferable. To my knowledge, there is no CL method 
that produces well calibrated confidence or credibility intervals. Perhaps because of this, 
Maniatis et al. (2005) state that "[t]he main objective in positional cloning is to estimate the 
kb location of a causal SNP as accurately as possible, with its support interval an important 
but secondary objective." However, it seems to me that we should focus on methods for 
computing well calibrated credibility intervals, and ideally a well calibrated posterior den- 
sity. The acid test is to ask whether a statistical method informs us about what is a good 
action or decision to be taken subsequently. A point estimate for QTL position, without a 
reliable measure of precision, is not very helpful for planning future experiments to further 
refine the position of that QTL. 

One of the more surprising results is that, in the simulations performed here, the non- 
parametric likelihood ratio (NLR) test derived from the method proposed previously by me 
(Johnson 2005) is basically as powerful as the BP for detecting a QTL. This is surprising 
because there is no theoretical basis for the NLR test statistic, but a strong theoretical basis 
for the BP test statistic. Since the NLR can be computed much more quickly, both in abso- 
lute and complexity terms, its performance in simulations over a wider range of parameters 
will be examined in a subsequent paper. 

Given that the NLR performs as well as the BP for detecting a QTL, but that the BP 
is much more expensive to compute, one might reasonably ask what are the benefits of the 
Bayesian method described here. Pirstly, the BP has a Bayesian interpretation, and since 
it can be negative it can indicate Bayesian sense evidence in favour of there being no QTL. 
The NLR test statistic can never be negative, has no direct Bayesian interpretation, and 
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is not a good approximation to the BF. Secondly, the posterior median from the Bayesian 
method provides superior point estimates under absolute error losses. Thirdly, the Bayesian 
method produces well calibrated credibility intervals, but the profile likelihood method I 
proposed previously does not (see figure 4 of Johnson 2005). Finally, the unconditional cov- 
erage frequencies of credibility intervals say nothing about the conditional or Bayesian sense 
performance of a method. For multistage QTL mapping experiments we should probably 
guide our choice of where to type further markers using the typically complex, heavy tailed 
and often multimodel posterior distributions computed using the Bayesian method described 
here, as exemplified in figure 7. If analysing data from a whole genome scan, I would recom- 
mend a multistage analysis that first uses the NLR statistic to identify regions of interest, 
and the to use the CPQ algorithm to compute Bayes factors and posterior distributions for 
QTL position within those regions. 

Given the large number of simplifications made in specifying the model used here, 
one might wonder why the method works at all. The three most obviously inadequate 
approximations are the star shaped genealogy, the absence of the disease allele in the control 
pool, and the assumption of linkage equilibrium in non-ancestral blocks of chromosome. I 
will briefly discuss these inadequacies in turn. 

Figures 8 and 9 show that credibility intervals only achieve prescribed coverage levels 
when a correction is made for the genealogy not in fact being star shaped. This suggests 
a serious inadequacy of the model. This is further supported by the observation of the 
very similar ROC curves in figures 2 and 3 for the theoretically optimal BF (assuming a star 
shaped genealogy), and the NLR statistic that has no theoretical basis (but allows any shape 
genealogy; Johnson 2005). Addressing this inadequacy is likely to lead to greater power to 
detect a QTL, and perhaps smaller credibility intervals of a given size. However, it will 
be hard to achieve without imposing a substantial computational burden. In particular, it 
may become difficult to compute the BF test statistic if MCMC is used to integrate over 
genealogies at the QTL. 

Although it is conceptually straightforward to allow blocks of ancestral chromosome 
in the control pool, this would increase the number of hidden states at each SNP from 
{rid + 1) to (nj + l)(nc + 1). Since the propagation algorithm (section 3.3) requires time 
that is quadratic in the number of hidden states, the analysis would be intractable using 
the current approach. As an alternative, any number of separate pools could be treated 
as conditionally independent HMMs, but then we would have to integrate over the high 
dimensional space of allele frequencies and ancestral haplotypes using MCMC or importance 
sampling (see below). 

It is possible that the current model adapts to fit there being blocks of ancestral chro- 
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mosome in the control pool, by appropriate adjustment of the allele frequency parameters. 
Ancestral blocks that are explicitly modelled in the disease pool would then represent ad- 
ditional blocks beyond what would be expected according to the adjusted allele frequency 
parameters. If this was so, the parameter p might be best interpeted as representing the 
rate of excess disease alleles in the disease pool. 

Since only marginal observations are available, the assumption of linkage equilibrium 
may be relatively innocuous. Since there is virtually no information in the data about 
linkage disequilibrium, introducing parameters describing linkage disequilibrium into the 
model might have little effect on inferences about the quantities of interest. It is possible 
to retain the present framework where all the data are modelled as a single HMM, but to 
include pairwise linkage disequilibrium by allowing allelic state along each chromosome to 
be a first order Markov chain (see e.g. Liu et al. 2001, Morris et al. 2002). This will be quite 
computationally expensive, but could be examined in the future. 

For the parameters chosen for the simulations performed here, the benefits of the present 
Bayesian method are somewhat modest. It remains unclear whether there would be larger 
benefits for other values of the simulation parameters, in particular more SNPs in the dataset, 
and/or larger benefits from a Baysian analysis with a more realistic model. Clarification of 
both points awaits access to substantial computational resources. It is worth commenting 
that many of the variables in the present model also feature in more elaborate models, and 
therefore the present approach could be used to generate (for example) a joint importance 
sampling distribution for the ancestral haplotype, allele frequencies, and age and position of 
the QTL. 

Even the simulated datascts studied here were generated under a model that lacks 
realism in several respects. For example, in simulating errors in allele frequency estimation I 
have ignored differential amplification of the two alleles, which may cause estimates of allele 
frequencies obtained using DNA pools to be biased. This manifests itself as only a second 
order effect on the difference in allele frequency between case and control pools (Visscher 
and Le Hellard 2003). Differential amphfication can be accomodated easily in the present 
method of analysis, for example by making y^,? ^ vector consisting of data from the pool 
and also from heterozygous individuals or pools of known composition. Even if no data 
from heterozygotes is available, it is possible to compute a Pr {y\y) by integrating over a 
distribution of differential amplification constants, like in the approach of Moskvina et al. 
(2005). 

One feature of the posteriors calculated using the present method (and especially after 
McPcck and Strahs (1999) fiattening) is that they are very heavy tailed, and so large cred- 
ibility intervals (99%, 99.9%) tend to be very wide, perhaps almost as wide as credibility 
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intervals computed from the prior! This suggests that, if a series of fine scale mapping ex- 
periments were conducted using DNA poohng, we would not be making radical reductions in 
the size of the region under study at each stage, but rather would be increasing the density 
of markers in some regions more than others after each stage of analysis. 

Software 

A software package implementing the methods described here is available from the web 
site http://homepages.ed.ac.uk/tobyj/software/ . Source code is available and the 
package can be distributed freely under the terms of the GNU general public licence (Free 
Software Foundation 1991). 
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Table 1. Frequently used notations. 



S>'ml")ol Meaning 



Qi Allele present (0 or 1) on ancestral haplotype at i-th SNP 

b,b' Backwards variables, see (13) and (14) 

B{a, (3) Beta distribution with parameters a and (3 

Bin (n, p) Binomial distribution with parameters n and p 

Bin {x, n, p) Probability of drawing x from a binomial distribution with parameters n 

and p 

BF Bayes factor 

CPQ Cartesian product quadrature 

Number of design points used for in quadrature algorithm 

Cj Precision of assay used to genotype i-th SNP 

E (A) Exponential distribution with rate parameter A (mean 1/A) 

g Genotype relative risk; factor by which disease allele increases penetrance 

or risk 

G (a, (3) Gamma distribution with shape parameter a and scale parameter /3 

HMM Hidden Markov model 

i Index of SNP, i = 1, . . . , L 

i.b.d. Identical by descent 

L Number of SNPs 

rrii Map position of the i-th marker 

MCMC Markov chain Monte Carlo 

rici rid Number of chromosomes in control and case pools respectively 

N {/I, a^) Normal distribution with mean /i and variance cr^ 

NLR Nonparametric likelihood ratio, see (30) 

Pmin Smallest p-values out of L tests in single point analysis 

Pi,a Prior parameter: 7rj(a) ~ B {Pi^a, Pis-a) 

r Number of experimental replicates used to estimate ACt 

R Prior parameter: p ~ B (i?i, Rq) 

ROC Receiver operating characteristics 

T Prior parameter: r ~ E (T) 

Xi Number of chromosomes in case pool carrying ancestral i.b.d. haplotype 

at i-th SNP 

Xfjt Number of chromosomes in case pool carrying ancestral i.b.d. haplotype 

at position of QTL 

l/c,i(a) True count of allele a at i-th SNP in control pool 

yA,i{o) True count of allele a at i-th SNP in case pool 



Table 1 — Continued 



Symbol Meaning 



yc,i{o) Estimated count of allele a at i-th SNP in control pool 

yd,i(a) Estimated count of allele a at i-th SNP in case pool 

y Shorthand for yc,j(l) or yd,i(l) for some i 

y All the data 

a Nominal size (rate of type I error) of a test 

ACt Estimated lag between PGR growth curves used to type DNA pool 

7(3 Penetrance (risk of disease) for genotype G at the QTL 

H Map position of the disease locus 

Aminp Map position of SNP with smallest p-value in single point analysis 

7rj(l) Expected frequency of allele 1 at i-th SNP in non-ancestral chromosome 

p Expected frequency of disease allele in case pool 

(7 Standard deviation of experimental error in estimation of ACt 

T Age of the disease allele 
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Table 2. Performance of tests to detect a disease QTL, when allele frequencies in each 

pool are known exactly. 



Statistic 


Method 


Nominal size 


Critical value 


True size 


Power 


21nBF 


Arbitrary 







0.080 




0.870 










(^U.Uoo, 


n 1 n7^ 
U.iU/ ) 


(0.837, 0.898) 


21nBF 


Arbitrary 


0.05^ 


5.889 


0.010 




0.710 










(0.003, 


0.023) 


(0.668, 0.749) 


Pmin X L 


Bonferonni 


0.05 


0.05 


0.040 




0.720 










(0.025, 


0.061) 


(0.678, 0.759) 


21nBF 


Simulation 


0.05 


0.903 


0.050 




0.842 










(0.033, 


0.073) 


(0.807, 0.873) 


Pmin X L 


Simulation 


0.05 


0.063 


0.050 




0.740 










(0.033, 


0.073) 


(0.699, 0.778) 


2 In BP 


Arbitrary 


0.01^ 


9.19 


0.002 




0.628 










(0.000, 


0.011) 


(0.584, 0.67) 


Pmin X L 


Bonferonni 


0.01 


0.010 


0.000 




0.582 










(0.000, 


0.006) 


(0.537, 0.626) 


2 In BP 


Simulation 


0.01 


5.864 


0.010 




0.710 










(0.003, 


0.023) 


(0.668, 0.749) 


Pmin X L 


Simulation 


0.01 


0.027 


0.010 




0.664 










(0.003, 


0.023) 


(0.621, 0.705) 



^not a nominal size in the classical sense but a nominal upper bound on the Bayesian sense 
error rate 
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Table 3. Performance of tests to detect a disease QTL, when there are errors in allele 
frequency estimation with r — 2 replicates and cr = 0.2 PGR cycles. 



Statistic 


Method 


Nominal size 


Critical value 


True size 


Power 


21nBF 


Arbitrary 







0.080 


0.782 










(0.058, 0.107) 


(0.743, 0.817) 


2 In BP 


Arbitrary 


0.05^ 


5.889 


0.008 


0.532 










(0.002, 0.020) 


(0.487, 0.576) 


Pmin X L 


Bonferonni 


0.05 


0.05 


0.040 


0.560 










(0.025, 0.061) 


(0.515, 0.604) 


2 In BP 


Simulation 


0.05 


0.723 


0.050 


0.746 










(0.033, 0.073) 


(0.705, 0.784) 


Pmin X L 


Simulation 


0.05 


0.085 


0.050 


0.642 










(0.033, 0.073) 


(0.598, 0.684) 


2 In BP 


Arbitrary 


0.01^ 


9.19 


0.000 


0.424 










(0.000, 0.006) 


(0.380, 0.469) 


Pmin X L 


Bonferonni 


0.01 


0.01 


0.010 


0.432 










(0.003, 0.023) 


(0.388, 0.477) 


2 In BP 


Simulation 


0.01 


4.28 


0.010 


0.596 










(0.003, 0.023) 


(0.552, 0.639) 


Pmin X L 


Simulation 


0.01 


0.01 


0.010 


0.432 










(0.003, 0.023) 


(0.388, 0.477) 



^not a nominal size in the classical sense but a nominal upper bound on the Bayesian sense 
error rate 
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Table 4. Performance of point estimators of QTL position. 



Estimator Average expected loss under 

squared error losses absolute error losses 



Allele frequencies known exactly 





0.208 2 


0.120 


Mean {fi\y) 


0.165^ 


0.107 


Median (/x 


0.166 2 


0.105 


NP method 


0.165 2 


0.112 


Errors in allele frequency estimation 




0.2392 


0.146 


Mean {n\y) 


0.1952 


0.132 


Median {iJ,\y) 


0.2032 


0.130 


NP method 


0.198 2 


0.136 



-38- 



List of Figures 

1 Hicrachical or Bayesian network structure of the model. The region inside 
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i = l,2,...,L 




Fig. 1. — Hierachical or Bayesian network structure of the model. The region inside the 
rectangle is duplicated for each of L SNPs. Lines indicate the dependence structure of the 
model: Variables not connected are independent, conditional on all other variables in the 
model. 
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Fig. 2. — Sensitivity vs. specificity for 21nBF (solid line) and pmin x L (dashed line), when 
allele frequencies in each pool are known exactly. The dotted line shows the performance of 
2 In BP computed using the priors obtained from figure 5, and the dot-dashed line shows the 
performance of a nonparametric likelihood ratio test statistic (Johnson 2005; see text). 



-42 - 




Fig. 3. — Sensitivity vs. specificity for 21nBF (solid line) and p^ain x L (dashed line), when 
there are errors in allele frequency estimation with r — 2 rephcates and a — 0.2 PGR cycles. 
The dotted line shows the performance of 2 In BP computed using the priors obtained from 
figure 5, and the dot-dashed line shows the performance of a nonparametric likelihood ratio 
test statistic (Johnson 2005; see text). 
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Fig. 4. — Sampling distribution of test statistics 2 In BF (top) and pmin x L (bottom, on log 
scale) under null model [g = 1), as functions of L, the number of SNPs in the simulated data 
set. The 0.95 and 0.99 quantiles are shown as solid lines. The least squares linear regression 
is shown as a dotted line. Results shown are for the situation where there are errors in allele 
frequency estimation, but results are similar when allele frequencies are known exactly. 
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Fig. 5. — Original priors (dotted lines) and distribution of posterior expectations (solid lines) 
for the three parameters of the approximate model. This suggests more accurately specified 
priors (dashed lines) as described in the text. 
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Fig. 6. — Example simulated datasets with (? = 4 and where allele frequences are known 
exactly. Points arc — log^^Q p for single point tests. Dotted lines are posterior density and 
solid lines are posterior density with McPeek-Strahs correction. Vertical dashed lines show 
position of disease QTL. 
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Fig. 7. — The same simulated datasets as shown in figure 6, but with errors in allele frequency 
estimation with a = 0.2 PGR cycles and r = 2 experimental replicates. Points are — logj^gP 
for single point shrunk (Visscher and Le Hellard 2003) tests. Dotted lines are posterior 
density and solid lines are posterior density with MePeek-Strahs correction. Vertical dashed 
lines show position of disease QTL. 
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Fig. 8. — Nominal and achieved coverage of credibility intervals for position of QTL, when 
allele frequencies are known exactly. Credibility intervals were constructed either without 
(dotted line) or with (solid line) the approximate correction factor of McPeek and Strahs 
(1999). 
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Fig. 9. — Nominal and achieved coverage of credibility intervals for position of QTL, when 
there are errors in allele frequency estimation, with a = 0.2 and r = 2. Credibility intervals 
were constructed either without (dotted line) or with (solid line) the approximate correction 
factor of McPeek and Strahs (1999). 
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Fig. 10. — Analysis of data of Hosking et al. (2002; top panel), and quasi-synthetic datasets 
generated by randomly relabelling controls clS CclSGS with probability 10%, 20% or 30% (lower 
three panels, top to bottom). Points are — log^^Q (p) from single point tests, and dashed 
and solid lines are the marginal posterior for disease gene position, without and with the 
correction factor of (McPeek and Strahs 1999). Vertical dashed lines show the true position 
of CYP2D6. 



